AITopics | Cambria County

Chain-of-Thought (CoT) reasoning has significantly advanced state-of-the-art AI capabilities. However, recent studies have shown that CoT reasoning is not always faithful, i.e. CoT reasoning does not always reflect how models arrive at conclusions. So far, most of these studies have focused on unfaithfulness in unnatural contexts where an explicit bias has been introduced. In contrast, we show that unfaithful CoT can occur on realistic prompts with no artificial bias. Our results reveal non-negligible rates of several forms of unfaithful reasoning in frontier models: Sonnet 3.7 (16.3%), DeepSeek R1 (5.3%) and ChatGPT-4o (7.0%) all answer a notable proportion of question pairs unfaithfully. Specifically, we find that models rationalize their implicit biases in answers to binary questions ("implicit post-hoc rationalization"). For example, when separately presented with the questions "Is X bigger than Y?" and "Is Y bigger than X?", models sometimes produce superficially coherent arguments to justify answering Yes to both questions or No to both questions, despite such responses being logically contradictory. We also investigate restoration errors (Dziri et al., 2023), where models make and then silently correct errors in their reasoning, and unfaithful shortcuts, where models use clearly illogical reasoning to simplify solving problems in Putnam questions (a hard benchmark). Our findings raise challenges for AI safety work that relies on monitoring CoT to detect undesired behavior.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2503.08679

Country:

North America > United States > Nevada > Carson City (0.14)
North America > United States > Wisconsin > Sheboygan County > Sheboygan (0.14)
Asia > Middle East > Iraq (0.04)
(28 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (0.68)
Media > Film (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Vance roasts Walz over video game gaffe, needling former coach on football IQ

FOX News | Oct-29-2024, 19:31:10 GMT

Ohio senator and Republican candidate for vice president JD Vance took aim at Minnesota Gov. Tim Walz's football IQ after the Democratic vice presidential nominee posted a confusing tweet during a livestream of himself playing Madden. "They parade Tim Walz around as some kind of football genius as a former football coach, and maybe I know more about football than Gov. Tim Walz does," Vance said during a rally in Saginaw, Michigan on Tuesday. The comments come after Walz teamed up with Rep. Alexandria Ocasio-Cortez, D-N.Y., to livestream a session of the two playing the Madden NFL video game against each other, an event that was reportedly an effort by the campaign to widen its appeal among young male voters.

football iq, vance roast walz, walz, (12 more...)

FOX News

Country:

North America > United States > Minnesota (0.29)
North America > United States > Nevada > Clark County > Las Vegas (0.27)
North America > United States > Ohio (0.26)
(2 more...)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Games (0.62)

Minimizing Live Experiments in Recommender Systems: User Simulation to Evaluate Preference Elicitation Policies

Hsu, Chih-Wei, Mladenov, Martin, Meshi, Ofer, Pine, James, Pham, Hubert, Li, Shane, Liang, Xujian, Polishko, Anton, Yang, Li, Scheetz, Ben, Boutilier, Craig

arXiv.org Artificial Intelligence | Sep-25-2024

Evaluation of policies in recommender systems typically involves A/B testing using live experiments on real users to assess a new policy's impact on relevant metrics. This "gold standard" comes at a high cost, however, in terms of cycle time, user cost, and potential user retention. In developing policies for "onboarding" new users, these costs can be especially problematic, since onboarding occurs only once. In this work, we describe a simulation methodology used to augment (and reduce) the use of live experiments. We illustrate its deployment for the evaluation of "preference elicitation" algorithms used to onboard new users of the YouTube Music platform. By developing counterfactually robust user behavior models, and a simulation service that couples such models with production infrastructure, we are able to test new algorithms in a way that reliably predicts their performance on key metrics when deployed live. We describe our domain, our simulation models and platform, results of experiments and deployment, and suggest future steps needed to further realistic simulation as a powerful complement to live experiments.

artist, proceedings, simulation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3626772.3661358

2409.17436

Country:

Oceania > Australia > Victoria > Melbourne (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
(18 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Media (0.68)
Information Technology > Services (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Videogames 'Fortnite,' 'Minecraft' Catapult Smiley Salamander to Global Fame

WSJ.com: WSJD - Technology | Mar-7-2022, 15:17:00 GMT

A global audience of a half-billion gamers have gotten to know the axolotl, which largely cluster in the canals around Mexico City and look like little dragons with a goofy smile. The videogame "Fortnite" trotted out axolotl characters in 2020, and "Minecraft" followed suit last summer. Roblox, a platform with millions of user-made games, has dozens of axolotl-centric ones, including "Axolotl Tycoon" and "Axolotl Paradise." Axolotls appear in "Adopt Me!," one of the most-played games on Roblox. All of the exposure has spawned axolotl memes, YouTube videos, coloring books and nonfungible tokens.

axolotl, catapult smiley salamander, minecraft, (13 more...)

WSJ.com: WSJD - Technology

Country:

North America > Mexico > Mexico City > Mexico City (0.28)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.06)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)
(5 more...)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Communications > Social Media (0.73)
Information Technology > Artificial Intelligence > Games > Computer Games (0.62)

AI-Powered Kiosks Help Spot COVID-19 in Pennsylvania

#artificialintelligence | Aug-19-2020, 03:18:22 GMT

Businesses are installing new thermal imaging kiosks to check employees and visitors for high temperatures and even face coverings. Watkins Security, LLC, a Johnstown-based company that specializes in networks and security systems, introduced the technology in response to interest from the local business community that is continuing efforts to protect people from COVID-19. Watkins Security, LLC, President Christopher Watkins said it's a form of artificial intelligence or deep learning technology. "There is a paradigm brewing. The market is heading to these types of systems – fast response, automatic. Every little second matters," he said.

ai-powered kiosk help spot covid-19, artificial intelligence, machine learning, (6 more...)

#artificialintelligence

Country:

North America > United States > Pennsylvania > Cambria County (0.06)
Asia > China (0.06)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.77)
Health & Medicine > Therapeutic Area > Immunology (0.77)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)

Concurrent Technologies Corporation awarded contract to research artificial intelligence

#artificialintelligence | Mar-24-2017, 19:55:19 GMT

Johnstown, PA – The National Geospatial-Intelligence Agency (NGA) has awarded Concurrent Technologies Corporation (CTC) a competitively bid prime $498,000 contract to research artificial intelligence (AI) automation capabilities under the agency's Boosting Innovative GEOINT (BIG) program. Under this contract, CTC will further advance the state of knowledge in virtual assistants by evolving from a singular user-centric orientation to a viewpoint of the user within an interdependent network of humans and cognitive machines. The goal is to enable task automation and management to be cloud-wide rather than confined within a traditional desktop or server architecture; and augment and scale human analysts' cognition and intelligence via a capability refined for intelligence community needs. CTC will provide NGA an AI Analyst Virtual Assistant (AVA) capability, incorporating smart technology to address significant challenges in multi-tasking and processing large quantities of rapidly changing information. The AVA will provide recommendations and predictions based on analysts' needs and tasking and by socializing analysts' activities to one another.

concurrent technology corporation, contract, research artificial intelligence, (2 more...)

#artificialintelligence

Country: North America > United States > Pennsylvania > Cambria County > Johnstown (0.28)

Industry: Government (0.79)

Technology: Information Technology > Artificial Intelligence (1.00)

A Personalized System for Conversational Recommendations

Goker, M. H., Langley, P., Thompson, C. A.

arXiv.org Artificial Intelligence | Jun-30-2011

Searching for and making decisions about information is becoming increasingly difficult as the amount of information and number of choices increases. Recommendation systems help users find items of interest of a particular type, such as movies or restaurants, but are still somewhat awkward to use. Our solution is to take advantage of the complementary strengths of personalized recommendation systems and dialogue systems, creating personalized aides. We present a system -- the Adaptive Place Advisor -- that treats item selection as an interactive, conversational process, with the program inquiring about item attributes and the user responding. Individual, long-term user preferences are unobtrusively obtained in the course of normal recommendation dialogues and used to direct future conversations with the same user. We present a novel user model that influences both item search and the questions asked during a conversation. We demonstrate the effectiveness of our system in significantly reducing the time and number of interactions required to find a satisfactory item, as compared to a control group of users interacting with a non-adaptive version of the system.

artificial intelligence, human computer interaction, user model, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1318

1107.0029

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Washington > King County > Seattle (0.04)
(37 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Consumer Products & Services > Restaurants (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

A Personalized System for Conversational Recommendations

Thompson, C. A., Goker, M. H., Langley, P.

Journal of Artificial Intelligence Research | Mar-1-2004

Searching for and making decisions about information is becoming increasingly difficult as the amount of information and number of choices increases. Recommendation systems help users find items of interest of a particular type, such as movies or restaurants, but are still somewhat awkward to use. Our solution is to take advantage of the complementary strengths of personalized recommendation systems and dialogue systems, creating personalized aides. We present a system -- the Adaptive Place Advisor -- that treats item selection as an interactive, conversational process, with the program inquiring about item attributes and the user responding. Individual, long-term user preferences are unobtrusively obtained in the course of normal recommendation dialogues and used to direct future conversations with the same user. We present a novel user model that influences both item search and the questions asked during a conversation. We demonstrate the effectiveness of our system in significantly reducing the time and number of interactions required to find a satisfactory item, as compared to a control group of users interacting with a non-adaptive version of the system.

interaction, proceedings, user model, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1318

AI Access Foundation

10374

Country: